README: Replication for “Demand Shocks, Procurement Policies, and the Nature of Medical Innovation: Evidence from Wartime Prosthetic Device Patents” – RESTAT
Date written: February 17, 2023

This document provides instructions to replicate the results in “Demand Shocks, Procurement Policies, and the Nature of Medical Innovation: Evidence from Wartime Prosthetic Device Patents” by Jeffrey Clemens and Parker Rogers.
To replicate the results, download the full zipped replication package, unzip the folder, and execute "Master.do" in Stata version 16. Python 3 is needed only for Table C.2 and Table C.3 in Appendix C; to skip that step, comment out the Python call in "Master.do".
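For unattended runs, the step above can be sketched as a small Python launcher. This is only an illustration under stated assumptions: the executable name (for example `stata-mp`) depends on the local installation, the `-b do` batch syntax applies to Stata on Unix-like systems (Windows uses a different syntax), and `run_master` and `build_stata_command` are hypothetical helper names that are not part of the replication package.

```python
import subprocess


def build_stata_command(executable, do_file="Master.do"):
    """Build a batch-mode command line for Stata on Unix-like systems."""
    # "-b do <file>" asks Stata to run the do-file in batch mode.
    # ASSUMPTION: `executable` is whatever the local install provides
    # (e.g. "stata-mp"); it is not specified by the replication package.
    return [executable, "-b", "do", do_file]


def run_master(executable, package_dir):
    """Run Master.do from inside the unzipped package folder."""
    # cwd=package_dir starts Stata in the folder that holds Master.do.
    # check=True raises CalledProcessError if the process exits nonzero.
    return subprocess.run(
        build_stata_command(executable), cwd=package_dir, check=True
    )
```

A zero exit code does not guarantee that every do-file finished without error, so it is still worth reading the log that Stata writes after the run.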

REPLICATION DIRECTORY
--------------------
Keep the following files in the same folder to run the replication. The required file structure is already in place in the downloaded folder. Note that you need write permission in the directory where you run the replication.
File Structure:
- “Master.do”- [Data] >> contains the data main data sets for replication of the results; More details provided below
- [DoFiles] >> folder contains main scripts for replication that are called from "Master.do"
- [Estimates] >> folder stores estimates used in generating figures
- [Figures] >> folder contains figure outputs
- [LogFiles] >> folder contains Stata log files
- [Tables] >> folder contains table outputs
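Before launching the replication, the layout and write-permission requirements above can be checked with a short script. The following is a minimal sketch, not part of the package: the folder and file names are taken from the list above, while `check_replication_dir` is a hypothetical helper name.

```python
import os
import tempfile

# Names taken from the file structure listed in this README.
REQUIRED_FOLDERS = ["Data", "DoFiles", "Estimates", "Figures", "LogFiles", "Tables"]
REQUIRED_FILES = ["Master.do"]


def check_replication_dir(root):
    """Return a list of problems found under `root` (empty list means OK)."""
    problems = []
    for name in REQUIRED_FOLDERS:
        if not os.path.isdir(os.path.join(root, name)):
            problems.append(f"missing folder: {name}")
    for name in REQUIRED_FILES:
        if not os.path.isfile(os.path.join(root, name)):
            problems.append(f"missing file: {name}")
    # Write check: try to create a temporary file, which is removed on close.
    try:
        with tempfile.TemporaryFile(dir=root):
            pass
    except OSError:
        problems.append(f"cannot write to: {root}")
    return problems
```

Calling `check_replication_dir(".")` from inside the unzipped folder should return an empty list when everything is in place.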

REPLICATION CODE FILES
--------------------
This replication package contains the master do-file "Master.do" and a set of do-files, which are stored in the folder “DoFiles”:
- “Master.do”: This do-file calls all sub-do-files needed to replicate all results and data of the paper and appendix.

### Main Text
- “ImportGoogleDataOnInventorGeographyFinal.do”: This do-file merges information from the various data sources in Berkes (2018).

- “WarsAndProthesisPatentsAnalysisOfCounts.do”:  Figure 1 (Panel A, Panel C), Figure D2, Figure D3, Figure D6, Table 3 (see output "NumberOfPatentsPoissonSpecRegs.tex" (Panel A), "NumberOfPatentsPoissonSpecRegsCIVOnly.tex" (Panel B), and "NumberOfPatentsPoissonSpecRegsWWIOnly.tex" (Panel C) in path "Tables/"), Table D.1 (see output "NumberOfPatentsLogSpecRegs.tex" (Panel A), "NumberOfPatentsLogSpecRegsCIVOnly.tex" (Panel B), and "NumberOfPatentsLogSpecRegsWWIOnly.tex" (Panel C) in path "Tables/").
- “ConfederateAndBritishPatentAnalysis.do”: This do-file generates Figure 1: Panel B, Panel D, Panel E, Panel F, Panel G | Table 4 (Column 5).

- “WarsAndProthesisPatentTrendsSynth.do”: Figure 2, Figure 3, Figure D7, Figure D8, Figure D9, Figure D10, Figure D11 | Table 4 (Columns 2 and 4; see output "PvalueDataSetfulltableoutput.dta" in path "Data/pvalueresults/"), Table C1 (see "SummaryStatsByWarAndTreatmentMovingAve.tex" output), Table D4 (see do file line "cor cost simplicity adjustability appliances appearance comfort durability materials").

- “WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do”: Table 4 (Columns 1 and 3; see output "simplediffpvaluesfulltableoutput.dta" in path "Data/simplediffresults/"), Table D3 (see output "FullSampleSummaryStatsAll.tex"), Table D6 (see output "CollSummaryStatsByWarAndTreatment.tex"), Table D7 (see output "CollSummaryStatsByWarAndTreatmentMED.tex"), Table D8 (see output "crudematchresultsdataTABLE.dta" in path "Data/crudematchresultsdata/").

### Appendix

- “WarsAndProthesisPatentsAnalysisOfCountsEventStudies.do”: Figure D1
- “AnanalysisOfCitationWeightedCounts.do”: Figure D4
- “arm_vs_leg.do”: Figure D5.
- “AnalysisOfHandCodedData1.do”: Table D.2, Table D.5
- "AppendixB.py": Figure B.3; Table B.1, Table B.2. Figure B.3 output in Figures folder; Other output in build results
- "get_production_weights.py": Table C.2, Table C.3. Output csv files are stored in the Tables>>Trait_Donor_Classes_Weights folder. We then hand entered the output into the tables.
- "CW_Synth_Perturbations.do": Figure B.4 output, preliminary
- "WWI_Synth_Perturbations.do": Figure B.4 output, preliminary
- "compile_synth_perturbations.py": Figure B.4 output, final. Gathers the output of the two .do files above and writes the figure to the Figures folder.


DATA SET DEFINITIONS, CONSTRUCTION, AND [VARIABLES]
--------------------

-all_patents_basicinfo.dta: This data is created in Berkes (2018). We obtained access to the data by contacting the author.
>>[patnum]: patent number
>>[ipc0]: first assigned International patent classification (IPC) code. Highest relevance.
>>[ipc1]: second assigned International patent classification (IPC) code. 
>>[ipc2]: third assigned International patent classification (IPC) code. 
>>[main_uspto]: USPTO patent classification by class/subclass
>>[pyear]: Patent priority year
>>[fyear]: Year the patent application was filed
>>[iyear]: Year the patent was granted
>>[imonth]: Month the patent was granted
>>[inv_fips1]: The FIPS identifier geolocating the first inventor's state + county of residence
>>[inv_state1]: The state of residence of the first inventor
>>[inv_country1]: The country of residence of the first inventor
>>[nclassgoogle1]: The USPTO numerical technological class of the invention
>>[nclassgoogle2]: The USPTO numerical technological subclass of the invention

-control_classes_CW.dta: This data represents our machine learning (ML) generated encodings of the economic traits of innovation across all technological classes during the Civil War era. The ML text analysis method is described in Appendix B. See Table 2 for the list of traits and their corresponding descriptions.
>>[patnum]: patent number
>>[class]: USPTO patent classification by class
>>[adjustability]: Indicator of whether or not the given patent emphasizes our adjustability trait. 
>>[comfort]: Indicator of whether or not the given patent emphasizes our comfort trait. 
>>[simplicity]: Indicator of whether or not the given patent emphasizes our simplicity trait. 
>>[materials]: Indicator of whether or not the given patent emphasizes our materials trait. 
>>[appearance]: Indicator of whether or not the given patent emphasizes our appearance trait. 
>>[cost]: Indicator of whether or not the given patent emphasizes our cost trait. 

-control_classes_WWI.dta: This data represents our machine learning (ML) generated encodings of the economic traits of innovation across all technological classes surrounding World War I. The ML text analysis method is described in Appendix B.
>>[patnum]: patent number
>>[class]: USPTO patent classification by class
>>[adjustability]: Indicator of whether or not the given patent emphasizes our adjustability trait. 
>>[comfort]: Indicator of whether or not the given patent emphasizes our comfort trait. 
>>[simplicity]: Indicator of whether or not the given patent emphasizes our simplicity trait. 
>>[materials]: Indicator of whether or not the given patent emphasizes our materials trait. 
>>[appearance]: Indicator of whether or not the given patent emphasizes our appearance trait. 
>>[cost]: Indicator of whether or not the given patent emphasizes our cost trait. 

-control_classes_KWSearch.dta: This data represents our simple keyword search generated encodings of the economic traits of durability and appliances across both wars. Keywords for these traits are listed in Figure A.1. These traits were not well predicted by machine learning techniques because they are specific to the prosthetic limb category, so we used keyword searches instead.
>>[patnum]: patent number
>>[class]: USPTO patent classification by class
>>[durability]: Indicator of whether or not the given patent emphasizes our durability trait. 
>>[appliances]: Indicator of whether or not the given patent emphasizes our appliances trait. 
>>[vulcanizedrubber]: Indicator of whether or not the given patent emphasizes our vulcanized rubber trait. We did not use this trait in our analysis. 

-with_header_subclass_all.csv. A database programmatically collected by the authors that crosswalks patent classification schema to corresponding class and subclass descriptive names (instead of numerical indicators).
>>[subcat]: USPTO subcategory classification within the patent classification system. Subcat subsumes class and subclass
>>[class]: USPTO class within the patent classification system. Class subsumes subclass
>>[subclass]: USPTO subclass within the patent classification system
>>[class_title]: USPTO textual description of the given class
>>[Description]: USPTO textual description of the given subclass
>>[main_uspto]: USPTO class/subclass in the format of Berkes (2018)
>>[header_subclass]: USPTO textual description of the given nested subclass (subsumes more specific descriptions under "Description" variable)
>>[header_subclass_name]: USPTO textual description of the given nested subclass

-with_header_subclass_formerge.dta: A subset of with_header_subclass_all.csv, keeping only the class/subclass map and the description of the subclass
>>[main_uspto]: USPTO class/subclass in the format of Berkes (2018)
>>[header_subclass]: USPTO textual description of the given nested subclass (subsumes more specific descriptions under "Description" variable)
>>[header_subclass_name]: USPTO textual description of the given nested subclass

-cit_received_long.dta: From Berkes (2018). Data obtained through the author. Links each patent to all of the patents that cite it.
>>[patnum]: patent number
>>[cit_received_patnum]: patent number of the patent citing "patnum" 
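Because this file is in long format (one row per cited-patent/citing-patent pair), the number of citations a patent received is the number of rows that share its patnum. The sketch below illustrates that count on in-memory pairs. It is an assumption-laden illustration, not the procedure used in the do-files: loading the .dta file is left to the reader, duplicate rows are counted once by assumption, and `count_citations_received` is a hypothetical helper name.

```python
from collections import Counter


def count_citations_received(pairs):
    """Count citing patents per cited patent.

    `pairs` is an iterable of (patnum, cit_received_patnum) tuples,
    one per row of the long-format file.
    """
    # ASSUMPTION: a repeated (cited, citing) row is treated as one citation.
    unique_pairs = set(pairs)
    return Counter(patnum for patnum, _ in unique_pairs)


# Hypothetical patent numbers, for illustration only.
example_pairs = [(1, 10), (1, 11), (2, 10), (1, 10)]
example_counts = count_citations_received(example_pairs)
```

Patents that never appear in the file get a count of zero when looked up, since `Counter` returns 0 for missing keys.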

-Prosthetics_adjust_names.dta. A manually coded dataset recording, for every prosthetic limb patent in our sample period, whether the patent covers an artificial leg or an artificial arm.
>>[patnum]: patent number
>>[leg]: Indicator equal to one if the patent is detailing a prosthetic leg innovation
>>[arm]: Indicator equal to one if the patent is detailing a prosthetic arm innovation

-all_manual_encodings.dta. A manually coded dataset indicating whether a given patent emphasizes the economic traits of interest in our study, and a list of keywords justifying that encoding. We include all prosthetic limb patents during the era of interest and a subset of patents across different, randomly selected technological classes across both wars.
>>[patnum]: patent number
>>[class]: USPTO patent classification "class"
>>[keywords]: Keywords we identified that justify our encodings
>>[adjustability]: Indicator of whether or not the given patent emphasizes our adjustability trait. 
>>[simplicity]: Indicator of whether or not the given patent emphasizes our simplicity trait. 
>>[durability]: Indicator of whether or not the given patent emphasizes our durability trait. 
>>[appearance]: Indicator of whether or not the given patent emphasizes our appearance trait. 
>>[materials]: Indicator of whether or not the given patent emphasizes our materials trait. 
>>[cost]: Indicator of whether or not the given patent emphasizes our cost trait. 
>>[comfort]: Indicator of whether or not the given patent emphasizes our comfort trait.

-BothWarsAll_Final.xlsx. A manually coded dataset indicating whether a given patent emphasizes the economic traits of interest in our study, and a list of keywords justifying that encoding. We include all prosthetic limb patents during the era of interest and a subset of patents across different, randomly selected technological classes across both wars.
>>[patnum]: patent number
>>[class]: USPTO patent classification "class"
>>[keywords]: Keywords we identified that justify our encodings
>>[adjustability]: Indicator of whether or not the given patent emphasizes our adjustability trait. 
>>[simplicity]: Indicator of whether or not the given patent emphasizes our simplicity trait. 
>>[durability]: Indicator of whether or not the given patent emphasizes our durability trait. 
>>[appearance]: Indicator of whether or not the given patent emphasizes our appearance trait. 
>>[materials]: Indicator of whether or not the given patent emphasizes our materials trait. 
>>[cost]: Indicator of whether or not the given patent emphasizes our cost trait. 
>>[comfort]: Indicator of whether or not the given patent emphasizes our comfort trait.

-BothWarsText_Final.xlsx. The patent text associated with the patent numbers in "BothWarsAll_Final.xlsx". We include all prosthetic limb patents during the era of interest and a subset of patents across different, randomly selected technological classes across both wars.
>>[patnum]: patent number
>>[class]: USPTO patent classification "class"
>>[pat_text]: The cleaned text of the patent document that we scraped from the internet for the given patent number

-CW_Patent_Counts.csv. A dataset that contains the number of British prosthetic limb patents from 1790–1869. Counts were derived from tabulations of pdf scans of historical patent documents.
>>[year]: Year
>>[count]: The number of prosthetic limb patents filed in the given year

-WWI_Patent_Counts.csv. A dataset that contains the number of British prosthetic limb patents from 1900–1930. Counts were derived from tabulations of pdf scans of historical patent documents.
>>[year]: Year
>>[count]: The number of prosthetic limb patents filed in the given year

-Brit_LimbKeywordSearchEncodings.csv. A dataset that contains the encoded economic traits of interest of British prosthetic limb patents from 1724–1930. Traits were encoded for patents for which we had textual information. We used a keyword search, instead of ML, because we did not have training data from Britain. However, keyword searches perform similarly to ML approaches.
>>[patnum]: patent number (Note: numbered under the British system; a British patent number may coincide with a US patent number, but the two are different patents because they are filed in different jurisdictions).
>>[year]: Year the patent was published
>>[Comfort]: Indicator of whether or not the given patent emphasizes our comfort trait.
>>[Adjustability]: Indicator of whether or not the given patent emphasizes our adjustability trait. 
>>[Simplicity]: Indicator of whether or not the given patent emphasizes our simplicity trait. 
>>[Cost]: Indicator of whether or not the given patent emphasizes our cost trait. 
>>[Appearance]: Indicator of whether or not the given patent emphasizes our appearance trait. 
>>[Materials]: Indicator of whether or not the given patent emphasizes our materials trait. 
>>[Appliances]: Indicator of whether or not the given patent emphasizes our appliances trait. 
>>[Durability]: Indicator of whether or not the given patent emphasizes our durability trait. 
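Since British and US patent numbers can coincide, any code that combines this file with the US data should not identify a patent by patnum alone. One way to avoid collisions is to key each record by jurisdiction and patent number together, as in the minimal sketch below. The jurisdiction labels "US" and "GB", the function name `combine_patents`, and the example rows are assumptions made for illustration; they do not appear in the replication files.

```python
def combine_patents(us_rows, british_rows):
    """Index patent records by a (jurisdiction, patnum) pair."""
    combined = {}
    for row in us_rows:
        # ASSUMPTION: "US" is an illustrative label, not a variable in the data.
        combined[("US", row["patnum"])] = row
    for row in british_rows:
        # ASSUMPTION: "GB" is an illustrative label, not a variable in the data.
        combined[("GB", row["patnum"])] = row
    return combined


# Hypothetical records that share a patent number across jurisdictions.
us_example = [{"patnum": 1234, "comfort": 1}]
british_example = [{"patnum": 1234, "Comfort": 0}]
combined_example = combine_patents(us_example, british_example)
```

With the composite key, both records survive even though their patent numbers are identical.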

-CW_Spanish_Pat_Counts.csv. A dataset that contains the number of Spanish prosthetic limb patents from 1850–1875. Counts were derived from tabulations of pdf scans of historical patent documents.
>>[Publication date]: Year the patents were published
>>[count]: The number of prosthetic limb patents published in the given year

-WWI_Spanish_Pat_Counts.csv. A dataset that contains the number of Spanish prosthetic limb patents from 1900–1930. Counts were derived from tabulations of pdf scans of historical patent documents.
>>[Publication date]: Year the patents were published
>>[count]: The number of prosthetic limb patents published in the given year


FIGURE/TABLE <-> DO FILE RELATIONSHIP
--------------------

The following .do files correspond to the indicated Figures and/or Tables:

Figure 1:
------
Panel A: WarsAndProthesisPatentsAnalysisOfCounts.do
Panel B: ConfederateAndBritishPatentAnalysis.do
Panel C: WarsAndProthesisPatentsAnalysisOfCounts.do
Panel D: ConfederateAndBritishPatentAnalysis.do
Panel E: ConfederateAndBritishPatentAnalysis.do
Panel F: ConfederateAndBritishPatentAnalysis.do
Panel G: ConfederateAndBritishPatentAnalysis.do

Figure 2:
------
Panel A, B, C, D: WarsAndProthesisPatentTrendsSynth.do

Figure 3:
------
Panel A, B, C, D, E: WarsAndProthesisPatentTrendsSynth.do

Table E.1:
------
Hardcoded using Google Patent Database; manufacturer names from Hasegawa (2012), Barnes (1865), Barnes and Stanton (1866), and Houston and Joynes (1866); and patent data from Berkes (2018).

Table 1:
------
Hardcoded using data from Barnes and Stanton (1866); Hasegawa (2012); the Census of Manufacturing tabulations; and patent data from Berkes (2018).

Table 2:
------
Hardcoded by the authors according to assessments from reading Hasegawa (2012), Linker (2011), and patents extracted from Google Patent Database.

Table 3:
------
Generated in WarsAndProthesisPatentsAnalysisOfCounts.do

Table 4:
------
Column (1): WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do
Column (2): WarsAndProthesisPatentTrendsSynth.do
Column (3): WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do
Column (4): WarsAndProthesisPatentTrendsSynth.do
Column (5): ConfederateAndBritishPatentAnalysis.do

APPENDIX
------

Figure A2: WarsAndProthesisPatentTrendsSynth.do
------

Figure B3: AppendixB.py
------

Figure B4: Run the section of "Master.do" labeled "Appendix figure B.4", which runs "CW_Synth_Perturbations.do" and "WWI_Synth_Perturbations.do", and then run "compile_synth_perturbations.py"
------

Table B1: AppendixB.py
------

Table B2: AppendixB.py
------

Table C1: WarsAndProthesisPatentTrendsSynth.do
------

Table C2: get_production_weights.py generates a set of .csv files containing donor classes and corresponding weights for each trait and war. Hand-coded thereafter
------

Table C3: get_production_weights.py generates a set of .csv files containing donor classes and corresponding weights for each trait and war. Hand-coded thereafter
------

Figure D1: WarsAndProthesisPatentsAnalysisOfCountsEventStudies.do
------

Figure D2: WarsAndProthesisPatentsAnalysisOfCounts.do
------

Figure D3: WarsAndProthesisPatentsAnalysisOfCounts.do
------

Figure D4: AnanalysisOfCitationWeightedCounts.do
------

Figure D5: arm_vs_leg.do
------

Figure D6: WarsAndProthesisPatentsAnalysisOfCounts.do
------

Figure D7: WarsAndProthesisPatentTrendsSynth.do
------

Figure D8: WarsAndProthesisPatentTrendsSynth.do
------

Figure D9: WarsAndProthesisPatentTrendsSynth.do
------

Figure D10: WarsAndProthesisPatentTrendsSynth.do
------

Figure D11: WarsAndProthesisPatentTrendsSynth.do
------

Table D1: WarsAndProthesisPatentsAnalysisOfCounts.do
------

Table D2: AnalysisOfHandCodedData1.do
------

Table D3: WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do
------

Table D4: WarsAndProthesisPatentTrendsSynth.do
------

Table D5: AnalysisOfHandCodedData1.do
------

Table D6: WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do
------

Table D7: WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do
------

Table D8: WarsAndProsthesisTraitsEstimatesAndPValuesFullBoom.do
------

The file "Master.do" will run each of these .do files to produce the given tables and figures in the "Tables" and "Figures" folders.